Fp8 hipstream fix #3127

acoskunses-AMD · 2024-09-12T21:19:38Z

Pick the HIP stream when CUDA is not defined.

netlify · 2024-09-12T21:19:56Z

❌ Deploy Preview for pytorch-fbgemm-docs failed.

Name	Link
🔨 Latest commit	`1e4e129`
🔍 Latest deploy log	https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/6706fc278bdd20000858ec29

acoskunses-AMD · 2024-09-12T21:39:37Z

@jianyuh @jwfromm

xw285cornell · 2024-09-19T20:57:39Z

We use the hipify script for this. Is the change here needed?

jianyuh · 2024-09-19T21:55:11Z

Rebase to resolve conflicts. Also, as Xiaodong mentioned, we don't need this internally (the code is hipified automatically). Is this needed in the OSS workflow? I guess hipify torch is not enabled?

xw285cornell · 2024-09-20T02:23:29Z

fbgemm_gpu/experimental/gen_ai/src/quantize/ck_extensions/kernels/fp8_rowwise_common.h

@@ -12,10 +12,13 @@
 #include <numeric>

 #include <ATen/ATen.h>
+#if !defined(USE_ROCM)


The hipify script should hipify .h files too, right?

xw285cornell · 2024-09-20T02:25:53Z

fbgemm_gpu/experimental/gen_ai/src/quantize/ck_extensions/fp8_blockwise_gemm.hip

@@ -12,11 +12,14 @@
 #include <numeric>

 #include <ATen/ATen.h>
-#include <c10/hip/HIPStream.h>
+#if !defined(USE_ROCM)


Is this block even needed? If USE_ROCM is not defined, it'll basically be an empty file, so we can just remove this block here.

Let's just remove this include and move it down into the USE_ROCM block. No need to include the CUDA header here.

acoskunses-AMD · 2024-09-30

@xw285cornell @jianyuh

Changed as requested. Please review.
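
For readers unfamiliar with the pattern discussed in this thread, the following is a minimal sketch of a `USE_ROCM` preprocessor guard. It is an illustration and not the code from the PR: it uses a hypothetical enum and constant in place of the real stream headers so that it compiles without PyTorch, and the names `StreamBackend`, `kStreamBackend`, and `stream_backend_name` are assumptions made for this example.

```cpp
// Hypothetical stand-in for the backend-specific stream headers.
enum class StreamBackend { Cuda, Hip };

#if defined(USE_ROCM)
// ROCm build: real code would include the HIP stream header here,
// e.g. <c10/hip/HIPStream.h> as seen in the diff above.
constexpr StreamBackend kStreamBackend = StreamBackend::Hip;
#else
// Non-ROCm build: real code would include the CUDA counterpart here.
constexpr StreamBackend kStreamBackend = StreamBackend::Cuda;
#endif

// Returns a short name for whichever backend the preprocessor selected.
constexpr const char* stream_backend_name() {
  return kStreamBackend == StreamBackend::Hip ? "hip" : "cuda";
}
```

In a file that is only ever built for ROCm, as the reviewers note for the `.hip` sources, the non-ROCm branch is dead code, which is why the thread settles on dropping the guard rather than keeping both branches.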

xw285cornell · 2024-09-20T02:27:30Z

fbgemm_gpu/experimental/gen_ai/src/quantize/ck_extensions/kernels/fp8_rowwise_common.h

@@ -12,10 +12,13 @@
 #include <numeric>

 #include <ATen/ATen.h>
+#if !defined(USE_ROCM)


in any case, I think maybe we can just remove this block just to be clean.

facebook-github-bot · 2024-09-30T21:52:56Z

@xw285cornell has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

xw285cornell

I guess we should never use nvcc to build those files, so we should be OK?

xw285cornell · 2024-10-09T07:55:46Z

fbgemm_gpu/experimental/gen_ai/src/quantize/ck_extensions/kernels/fp8_rowwise_common.h

@@ -201,4 +199,3 @@ at::Tensor f8f8bf16_rowwise_impl(
  return Y;
 }



@acoskunses-AMD can you remove the blank lines at the end of each file? Otherwise we cannot land it internally due to some lint check

facebook-github-bot · 2024-10-10T00:53:40Z

@xw285cornell has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2024-10-10T02:50:47Z

@xw285cornell merged this pull request in 8e7beba.

facebook-github-bot added the cla signed label Sep 12, 2024

acoskunses-AMD force-pushed the fp8_hipstream_fix branch 2 times, most recently from 0c9fc22 to 863f596, September 12, 2024 21:37

acoskunses-AMD force-pushed the fp8_hipstream_fix branch from c948fe8 to e94d9e9, September 19, 2024 23:08

xw285cornell reviewed Sep 20, 2024

acoskunses-AMD force-pushed the fp8_hipstream_fix branch 3 times, most recently from ee380a4 to 05e9b76, September 30, 2024 21:39

xw285cornell approved these changes Sep 30, 2024

xw285cornell reviewed Oct 9, 2024

removing redundant ROCM checks from hip files

1e4e129

acoskunses-AMD force-pushed the fp8_hipstream_fix branch from 05e9b76 to 1e4e129, October 9, 2024 21:56

facebook-github-bot closed this in 8e7beba Oct 10, 2024

facebook-github-bot added the Merged label Oct 10, 2024
